Partial E1 envelope glycoprotein gene sequences and complete structural polyprotein sequences were used
to compare divergence and construct phylogenetic trees for the genus Alphavirus. Tree topologies indicated that
the mosquito-borne alphaviruses could have arisen in either the Old or the New World, with at least two
transoceanic introductions to account for their current distribution. The time frame for alphavirus diversification
could not be estimated because maximum-likelihood analyses indicated that the nucleotide substitution
rate varies considerably across sites within the genome. While most trees showed evolutionary relationships
consistent with current antigenic complexes and species, several changes to the current classification are proposed.

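The recently identified fish alphaviruses salmon pancreas disease virus and sleeping disease virus
appear to be variants or subtypes of a new alphavirus species. Trocara virus, isolated from mosquitoes
in Brazil and Peru, also represents a new species and probably a new alphavirus complex.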



The family Togaviridae comprises two genera, Alphavirus
and Rubivirus (77). The genus Alphavirus contains at least
24 species (77) that can be classified antigenically into seven 
complexes (4) (Table 1). As a genus, the alphaviruses are 
widely distributed throughout the world, inhabiting all of the 
continents except Antarctica. The geographic distributions of 
individual species are restricted because of specific ecological 
conditions and reservoir host and vector restrictions (22, 77). 

Members of the genus Alphavirus are typically maintained in
natural cycles involving transmission by an arthropod vector 
among susceptible vertebrate hosts (60). Virus-host interac- 
tions may be highly specific, and sometimes only a single mos- 
quito species is utilized as the principal vector, as has been 
reported for many Venezuelan equine encephalitis (VEE) 
complex viruses (74). These specific virus-vector interactions
may limit the distribution of many alphaviruses. Possible ex- 
ceptions to the presumption that all alphaviruses have an ar- 
thropod host are the newly identified salmonid viruses salmon 
pancreas disease virus (SPDV) (81) and sleeping disease virus
(SDV) (69). These viruses have been isolated only from dis- 
eased Atlantic salmon and rainbow trout, respectively, and are 
not known to have arthropod vectors. It has been postulated 
that the sea louse, Lepeophtheirus salmonis, may play a role in 
the transmission of SPDV, but no evidence to support this 
hypothesis has been generated. Parasitic lice have been impli- 
cated in the transmission of the newly discovered southern 
elephant seal alphavirus (SESV) from the coast of Australia.
SESV has been grouped genetically with the Semliki Forest 
virus complex (32). 

The members of the genus Alphavirus cause a wide range of
diseases in humans and animals. Many Old World viruses, 
including the Ross River, Barmah Forest, Mayaro, o'nyong-
nyong, chikungunya, and Sindbis viruses, cause an arthralgia
syndrome (47, 52), while encephalitis is caused by VEEV, 
eastern equine encephalitis virus (EEEV), and western equine 
encephalitis virus (WEEV) in the New World. In addition to 
causing febrile illness in equines, pigs, and calves, Getah virus 
has been reported to potentially induce abortion or stillbirth in 
pregnant sows (20, 44). Highlands J virus causes dramatic 
decreases in egg production and mortality in domestic birds 
(13, 70). Seroprevalence data on many of the remaining alpha- 
viruses indicate that they infect people and/or domestic ani- 
mals but have unknown clinical manifestations or cause only a 
mild febrile illness (1, 29-31, 41, 63, 65). Interestingly, alpha- 
viruses causing similar disease symptoms are maintained under 
diverse ecological conditions and can have a widespread dis- 
tribution. For example, Mayaro virus is limited geographically 
to Latin America (46, 64) while o'nyong-nyong virus has never 
been identified outside of Africa (21, 33, 48). These two viruses 
cause almost identical clinical signs and symptoms. This un- 
usual epidemiological pattern seen among the various alpha- 
viruses presents some intriguing questions regarding evolution- 
ary relationships of the members of the Alphavirus genus, 
including the origins of the genus and subsequent geographic 
expansion of the genus and species. 

The alphaviruses are small, spherical, enveloped viruses with 
a genome consisting of a single strand of positive-sense RNA 
(22, 55, 60). The nonstructural protein genes are encoded in 
the 5' two-thirds of the genome, while the structural proteins 
TABLE 1. Alphaviruses studied (table body not recoverable from the source)

Classification follows the report of the ICTV; complexes, subtypes, and varieties are according to the SIRACA (4).
Abbreviations: ONNV, o'nyong-nyong virus; BFV, Barmah Forest virus; SAGV, Sagiyama virus; GETV, Getah virus; SFV, Semliki Forest virus; RRV, Ross River virus; NDUV, Ndumu virus.
Cell types: DEC, duck embryo cells; CEC, chicken embryo cells; gp, guinea pig; BHK, baby hamster kidney; C6/36, A. albopictus larvae; CHSE-214, Chinook salmon embryo.
FIG. 1. Organization of the Alphavirus genome. Gene products and associated functions are indicated.



are translated from a subgenomic mRNA colinear with the 3'
one-third of the genome (Fig. 1). Replication occurs within the 
cytoplasm, and virions mature by budding through the plasma 
membrane, where virus-encoded surface glycoproteins E2 and 
E1 are assimilated. These two glycoproteins are the targets of
numerous serologic reactions and tests (e.g., neutralization and 
hemagglutination inhibition); the alphaviruses show various 
degrees of antigenic cross-reactivity in these reactions, forming 
the basis for the seven antigenic complexes, 24 species, and 
many subtypes and varieties of alphaviruses defined previously 
(4, 23, 62). The E2 protein is the site of most neutralizing 
epitopes, while the E1 protein contains more conserved, cross-reactive epitopes.



Previous studies of the evolutionary relationships among 
alphaviruses have relied on phylogenetic analyses of either
partial or complete sequences from one or more of the seven 
protein genes (35, 73, 80). Overall, these studies have pro- 
duced relationships in agreement with the antigenically based 
approaches used traditionally for alphavirus classification (4, 7, 
77). For example, viruses in the VEE (49, 76), EEE (2, 75), and 
WEE antigenic complexes (80) have each been shown to be
monophyletic (WEE complex for the envelope glycoproteins 
only). Additionally, phylogenetic studies have shown that most
of the New World viruses in the WEE antigenic complex 
(WEEV, Highlands J virus, Fort Morgan virus, and Buggy 
Creek virus [a variant of Fort Morgan virus]) are descendants 
of an ancestral alphavirus that resulted from a recombination 
event; recombination combined the E2 and E1 envelope protein
genes from a Sindbis-like virus and the remaining genes
from an EEEV-like ancestor (19, 80). The Old World sero- 
groups have been studied in less detail; the chikungunya, o'nyong- 
nyong, Semliki Forest, and Ross River viruses, belonging to the 
Semliki Forest virus complex, are monophyletic in some analyses
and paraphyletic in others, with Middelburg virus falling
into this group in some trees (73, 79). 

To provide a more complete understanding of the evolution- 
ary history and mechanisms of emergence of alphaviruses, we 
conducted a comprehensive examination of the evolution of 
the genus by sequencing most of the E1 envelope glycoprotein
gene for representatives of all alphavirus species (77), as well 
as major antigenic subtypes and varieties (4). Using phyloge- 
netic methods, these sequences were used to reexamine the 
evolutionary history and systematics of the genus. 

MATERIALS AND METHODS 
Virus preparation. The virus strains used in this study are listed in Table 1.
Viruses were diluted and passaged on BHK-21 or Vero 76 cells at a low multiplicity
of infection. After approximately 75% of the cells exhibited cytopathic
effects, virus present in the supernatant was concentrated by precipitation and
resuspended in 150 µl of TEN (Tris-EDTA-NaCl) buffer, and 2 ml of Trizol LS
(Gibco BRL, Bethesda, Md.) was added in preparation for RNA extraction in
accordance with the manufacturer's protocol. 

RNA extraction and reverse transcription PCR. RNA was extracted from
one-half of each virus-Trizol suspension in accordance with the manufacturer's
protocols as described previously (8). cDNAs were synthesized from the RNA by
using a poly(T) oligonucleotide primer (T25V-Mlu; 5'-TTACGAATTCACGCGT(25)V-3').
PCR amplification was performed on the first-strand cDNA by
using the T25V-Mlu primer and a forward primer designated α10247A
(5'-TACCCNTTYATGTGGGG-3'). This forward primer anneals to a highly conserved
sequence that encodes the putative fusion domain of the E1 protein, and this
conservation allowed us to amplify most of the E1 glycoprotein gene from a wide
variety of highly divergent alphaviruses. Amplification of the carboxy portion of
the E1 gene and the 3' noncoding region utilized the following parameters: 30
cycles of denaturation at 95°C for 30 s, primer annealing at 49°C for 30 s, and
extension at 72°C for 3 min. A 10-min final extension was used to ensure
complete product synthesis. For the virus designated Ag80-663, for which the
above primer pairs were unsuccessful, the T25V-Mlu primer was used in conjunction
with primer E/V7514(+) (5'-ACYCTCTACGGCTRACCTRA-3') to
amplify the entire 26S subgenomic message region. Sequencing was performed
on this strain by gene walking using sequentially designed primers (see Table 2).
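The degenerate positions in the primers described above follow the IUPAC ambiguity codes given with Table 2 (Y, R, N, W, and V). As a minimal sketch, the number of distinct oligonucleotides represented by a degenerate primer can be counted as shown below. The helper names `degeneracy` and `expand` are hypothetical and are not part of the original protocol.

```python
from itertools import product

# IUPAC nucleotide codes; Table 2 uses Y, R, N, W, and V.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "W": "AT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Return the number of distinct sequences a degenerate primer encodes."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Forward primer sequence as quoted in the text (one N and one Y position).
forward = "TACCCNTTYATGTGGGG"
print(degeneracy(forward))
for sequence in expand(forward):
    print(sequence)
```

A primer with one N and one Y position, like the forward primer quoted in the text, is a mixture of 4 x 2 = 8 oligonucleotides. Primers with more ambiguous positions are more degenerate mixtures.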
Sequencing and genetic analysis. PCR products ranging in size from 1.1 to 1.8
kb were isolated from 1% agarose gels. The cleaned DNA fragments were either
sequenced directly or cloned into pBluescript II SK (Stratagene, La Jolla, Calif.)
that had been linearized with SmaI. Restriction enzyme SmaI was included in the
ligation reaction to reduce the religation of the vector upon itself. White bacterial
colonies were screened for plasmids containing inserts of the correct size.
Two selected clones were sequenced by using plasmid-specific T7 promoter and
M13 reverse primers. Additional internal sequence was obtained by using virus-specific
primers as indicated in Table 2. Sequencing was performed by using an
Applied Biosystems (Foster City, Calif.) Prism 377 sequencer and BigDye automated
DNA sequencing kit. Deduced amino acid sequences were aligned with
those of other alphaviruses sequenced previously (Table 1) by using the PILEUP
program in the University of Wisconsin Genetics Computer Group package (10)
with manual refinements to preserve codon homology. Pairwise comparisons
were performed with PAUP (61) and the GAP program within the Genetics
Computer Group package.

Phylogenetic analyses were performed on both the nucleotide and translated 
amino acid sequences for the El gene or complete 26S sequence by using the 
PAUP program (61). The heuristic algorithm was employed for the maximum-
parsimony analysis. The neighbor-joining distance matrix algorithm was used with the Kimura two-



TABLE 2. Primers used in reverse transcription PCR and sequencing reactions

Primer(a)          Sequence(d)
E/V7514(+)(b)      5'-ACYCTCTACGGCTRACCTRA-3'
α10247A(+)         5'-TACCCNTTYATGTGGG-3'
α10552(+)          5'-CAYGTNCCWTAYACVCAG-3'
α10613(+)          5'-CCCTAAATACGAAGGCTCC-3'
α10720(+)          5'-TAACAGCGGGAAATCGTTGC-3'
K10550(+)(c)       5'-CACGTTCCGTACACGCAAG-3'
SIN10790(+)        5'-GTGGAAAAACAACTCAGG-3'
T25V-Mlu(-)        5'-TTACGAATTCACGCGT(25)V-3'
T(n)V(-)           5'-T(n)V-3'

(a) Each number indicates the position of the primer in the full-length VEEV genome (26).
(b) Primer used to generate a PCR product for the Ag80-663 virus.
(c) Primer used specifically to sequence the Kyzylagach virus.
(d) Y = C or T; R = A or G; N = A, C, G, or T; W = A or T; V = A, C, or G.



TABLE 3. Alphavirus E1 3' NCR amplicon analysis (amplicon values not recoverable from the source)

Viruses listed: Getah, Sagiyama, Me Tri, Una, 78V3531, Mucambo (VEEV-IIIA), Tonate, Bijou Bridge, 71D1252, Cabassou, Ag80-663, Buggy Creek, and Babanki.
NA, not available.



parameter and F84 corrections. Bootstrap resampling to
place confidence values on the groupings within trees was performed with
1,000 replicates (12). For the generation of a maximum-likelihood model for
alphavirus sequence evolution, closely related sequences of many different
strains of EEEV (2), VEEV (49), WEEV (80), Highlands J virus (8), and
chikungunya virus (48) were analyzed to avoid the effects of superimposed
nucleotide substitutions. Tree topologies determined previously were used to
estimate the transition/transversion (Ti/Tv) ratio and gamma values for the unevenness
of substitution rates across nucleotide sites by using the PAUP program.
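To make the transition/transversion ratio concrete, the sketch below counts observed transitions (purine to purine or pyrimidine to pyrimidine) and transversions between two aligned sequences. This is only an illustration of the quantity. The study estimated the ratio by maximum likelihood in PAUP, which also corrects for multiple substitutions at a site. The example sequences and the function name `ti_tv_counts` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ti_tv_counts(seq1, seq2):
    """Count observed transitions and transversions between aligned sequences.

    Positions with gaps or ambiguous bases are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    return transitions, transversions


# Hypothetical aligned fragments, not taken from the study.
ti, tv = ti_tv_counts("ACGTACGTAC", "GCATATGTCC")
print(ti, tv, ti / tv)
```

Because closely related strains have few superimposed substitutions, raw counts of this kind are less biased for them than for distantly related viruses, which is the rationale the text gives for estimating the model from within-species data sets.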



E1 3' NCR amplification and sequencing. The 3' noncoding
region (NCR) and E1 envelope glycoprotein gene were selected
for genetic analyses to take advantage of conserved
sequences described previously for primer annealing and PCR
amplification (8). The E1 region has also been shown to be
phylogenetically informative. Alphavirus cDNAs were synthesized
by using an oligo(dT) primer containing a 3' clamp
(T25V-Mlu). By using this primer and a primer from a conserved
region of the E1 gene (α10247A), nearly all alphavirus
genomes were amplified. The VEE complex virus Ag80-663
(subtype VI) could not be amplified with the α10247A primer,
but the entire 26S region of this strain was amplified by using
T25V-Mlu and E/V7514(+). This was the only alphavirus that
required alternative amplification conditions (see Materials
and Methods). An analysis of the genome at the α10247A
primer binding site revealed that the primer site was a highly
conserved region across the entire Alphavirus genus and was an
exact match in strain Ag80-663, making it unclear why this
virus could not be amplified with this primer.

All of the amplicons generated ranged in size from 1.1 to 1.8
kb (Table 3), depending upon the length of the 3' NCR. The
shortest 3' NCRs belonged to the 78V3531 and Trocara viruses,
and Bebaru virus contained the longest (Table 3). Some
of the amplicons (Getah, Una, Babanki, and Trocara viruses)
did not contain the conserved alphavirus termini
(5'-ATTTTGTTTTTAATATTTC-3') (45), indicating that perhaps the entire
3' NCR was not amplified. Trocara virus was unique in that
the α10247A primer used in the PCR amplification was present
on both ends of the amplicon, suggesting a 3' NCR of only
34 nucleotides. The use of a longer poly(T) oligonucleotide 
primer increased the likelihood of obtaining the entire 3' NCR 
(T25V-Mlu compared to the shorter poly(T) primer in Table 2) but was still unsuccessful in some instances,
including that of Trocara virus. However, based on the
finding of George and Raju (16) that the classical 19-nt con- 
served terminal element is not essential for replication or virus 
maintenance, it is possible that some of these viral sequences 
that appear to be incomplete (because they lack the entire 
conserved 3' terminus) are actually complete. While there was 
considerable variability in the 3' NCR sequences, the E1 gene,
with the exception of the five or six 3' terminal codons, was
more conserved among all of the alphaviruses. Most viruses 
were sequenced directly by using the α10247A and α10552(+)
primers. However, several required additional primers, and the
Kyzylagach strain, a subtype of Sindbis virus, could not be
sequenced with the universal internal primer and required
virus-specific primers (Table 2). Occasionally, there were sequence
differences between isolates sequenced in our laboratory
and those in the GenBank database. The Mucambo (VEE
subtype IIIA), Tonate (VEE subtype IIIB), 71D1252 (VEE
subtype IIIC), and Ag80-663 (VEE subtype VI) viruses had
differences, typically in the third codon and/or synonymous
positions, that were most likely due to differences in passage 
history. The sequence analyses we performed utilized the iso- 
lates with the lowest passage histories available, which were 
generally lower than those of the isolates used to generate 
sequences already in the GenBank database. 

Direct comparisons of E1 gene sequences. To determine the
extent of relatedness of all established members of the genus
Alphavirus, pairwise comparisons were performed by using the
nucleotide and deduced amino acid sequences in the E1 gene
coding region (Table 4). The C-terminal 5 to 10 amino acids 
and their codons were omitted from the analyses because they 
were highly divergent and could not be aligned reliably (many 
alignment scores for this fragment did not differ statistically 
significantly from jumbled alignments). In general, the per- 
centage of sequence divergence correlated inversely with sero- 
logic cross-reactivity (4). Viruses within a given antigenic se- 
rocomplex were usually genetically more closely related than 
viruses in different complexes. Those within a given antigenic 
complex typically had a nucleotide sequence divergence of less 
than 43% and an amino acid sequence divergence of less than 
44%, while interserocomplex comparisons usually exceeded 38 
and 40%, respectively. The Middelburg virus complex was the 
least divergent of the antigenic complexes, with only 33% nu- 
cleotide and 31% amino acid sequence divergence compared 
with some Semliki Forest virus complex viruses, such as Getah 
virus. In contrast, Trocara virus exhibited considerable se- 
quence divergence versus all other alphaviruses, with at least 
43% nucleotide and 47% amino acid sequence divergence. 
These data support the previous conclusion that Trocara virus 
probably represents a new antigenic complex in the genus 
Alphavirus (65). 
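The divergence percentages discussed above are pairwise measures on aligned sequences. As a rough sketch of the idea, the uncorrected proportion of differing sites (p-distance) can be computed as below. The exact distance measure reported by PAUP and GAP in the study is not restated here, so treating the values as simple p-distances is an assumption. The function name and example sequences are hypothetical.

```python
def p_distance(seq1, seq2, gap_chars="-?"):
    """Proportion of aligned positions that differ, ignoring gapped sites.

    Works for nucleotide or amino acid alignments of equal length.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differences / compared


# Hypothetical aligned fragments, not taken from Table 4.
divergence = p_distance("ATGGCTAAGT", "ATGACT-AGC")
print(f"{100 * divergence:.0f}% divergence")
```

Applied to the nucleotide alignment and to the deduced amino acid alignment separately, a calculation of this kind gives the two kinds of percentages compared in the text.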
TABLE 4. E1 envelope glycoprotein gene sequence divergence (pairwise values not recoverable from the source)

Viruses compared: BFV, EEEV1, EEEV2, EEEV3, EEEV4, MIDV, NDUV, BEBV, CHIKV, GETV, MAYV, ONNV, RRV, RRV (SAGV), SFV, SFV (Me Tri virus), UNAV, VEEV (IAB), VEEV (IC), VEEV (ID), VEEV (IE), 78V3531 (VEE-IF), EVEV (VEE-II), MUCV (VEE-IIIA), MUCV (Tonate virus; VEE-IIIB), MUCV (Bijou Bridge virus; VEE-IIIB), 71D1252 (VEE-IIIC), PIXV (VEE-IV), CABV (VEE-V), Ag80V (VEE-VI), AURAV, FMV, FMV (BCV), HJV, SINV, SINV (BABV), SINV (KZLV), SINV (OCKV), WEEV, TROCV, SDV, SPDV.

Within antigenic complexes, different Alphavirus species
generally showed at least 21% nucleotide and 8% amino
acid sequence divergence. One exception was Everglades virus
(EVEV), which differed from some strains of VEEV by only
10% at the nucleotide level and 3% at the amino acid level. Me
Tri virus, which is associated with central nervous system disease
in Vietnamese children and is considered a new alphavirus
on the basis of antigenic tests (18), differed by only 2% in
its nucleotide sequence and 1% in its amino acid sequence
from Semliki Forest virus. Different subtypes of a given alphavirus
showed as little as 3% nucleotide and 1% amino acid sequence
difference (Sindbis virus) and as much as 25 and 13%,
respectively (Ross River and subtype Sagiyama viruses). The
maximum divergence between subtypes was found in VEE
subtype IF (78V3531), which differed from all other VEE
complex viruses by at least 22% at the nucleotide sequence
level and 19% at the amino acid sequence level. As in previous
studies, the Ag80-663 strain [...] (a species considered distinct from
VEEV) but was still quite distantly related.

The sequences of the two fish viruses were the most distinct
genetically, with at least 49% nucleotide and 59% amino acid
sequence divergence versus other alphaviruses. SDV and
SPDV had only 5% nucleotide and 2% amino acid sequence
divergence, values comparable to those of subtypes of other
alphaviruses.
Phylogenetic analyses of E1 gene sequences. Initially, phylogenetic
analyses were performed on the E1 gene region by
using maximum-parsimony and neighbor-joining methods.
These methods produced trees with similar topologies, differing
primarily in the relationships among some serocomplexes
and within the Semliki Forest complex. Viruses with inconsistent
placement included the Barmah Forest, Middelburg,
Mayaro, Una, and Trocara viruses. Neighbor joining grouped
the Barmah Forest and Ndumu viruses at the base of the
Semliki Forest clade and placed Middelburg virus within the
Semliki Forest group. In neighbor-joining trees, Trocara virus
was basal to the WEE complex, which grouped with the
EEEV-VEEV clade (Fig. 2). Maximum parsimony placed
Middelburg virus outside of the Semliki Forest virus clade 
without transversion weighting and placed Trocara virus at the 
base of a nonfish alphavirus clade (not shown). The placement 
of the Cabassou and Pixuna viruses within the VEE complex 
was also inconsistent when different methods were used. In 
general, analyses using amino acid sequences generated results 




similar to those described above, with diminished resolution 
within some terminal groupings due to loss of informative 
synonymous nucleotides. When all of the methods were used,
midpoint rooting placed the fish virus clade at the base of the
alphavirus tree, indicating that these viruses probably diverged
from the mosquito-borne alphaviruses very early in the evolution
of the genus.
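Midpoint rooting places the root halfway along the longest tip-to-tip path in an unrooted tree. The sketch below illustrates the procedure on a small hypothetical tree with one long branch. The branch lengths and node names are invented for illustration and are not taken from Fig. 2, and the functions are not the PAUP implementation.

```python
def add_branch(tree, u, v, length):
    """Record an undirected branch of the given length between u and v."""
    tree.setdefault(u, {})[v] = length
    tree.setdefault(v, {})[u] = length


def farthest(tree, start):
    """Return (node, distance, path) for the node farthest from start."""
    best = (start, 0.0, [start])
    stack = [(start, None, 0.0, [start])]
    while stack:
        node, parent, dist, path = stack.pop()
        if dist > best[1]:
            best = (node, dist, path)
        for neighbor, length in tree[node].items():
            if neighbor != parent:
                stack.append((neighbor, node, dist + length, path + [neighbor]))
    return best


def midpoint_root(tree, any_node):
    """Return ((u, v), offset): the root lies on branch u-v, offset from u."""
    tip_a, _, _ = farthest(tree, any_node)       # one end of the longest path
    _, longest, path = farthest(tree, tip_a)     # the other end and the path
    half = longest / 2.0
    walked = 0.0
    for u, v in zip(path, path[1:]):
        length = tree[u][v]
        if walked + length >= half:
            return (u, v), half - walked
        walked += length
    return None


# Hypothetical unrooted tree: one long branch ("fish") and three shorter tips.
tree = {}
add_branch(tree, "fish", "n1", 0.60)
add_branch(tree, "n1", "A", 0.20)
add_branch(tree, "n1", "n2", 0.10)
add_branch(tree, "n2", "B", 0.20)
add_branch(tree, "n2", "C", 0.25)

branch, offset = midpoint_root(tree, "n1")
print(branch, round(offset, 3))
```

In this toy tree the root falls on the long branch, so that lineage appears basal. The placement reflects branch length as much as ancestry, which is why midpoint rooting depends on roughly constant evolutionary rates across lineages.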

In an attempt to resolve the topological discrepancies de- 
scribed above, the maximum-likelihood method was used, based 
on a sequence evolution model derived from previously pub- 
lished detailed analyses of many strains from a given Alphavi- 
rus species or complex (2, 8, 43, 48, 49). Maximum-likelihood 
analysis of these data sets provided a mean estimate of 4.5 for 
the Ti/Tv ratio and a mean gamma value of 0.24. When these
values were used, the topologies generated by the maximum-parsimony
and neighbor-joining methods were evaluated by
using a Kishino-Hasegawa likelihood test (61). The topology 
generated by the neighbor-joining method, shown in Fig. 2, was
significantly more likely (maximum likelihood, P < 0.03) than
the topologies generated by the maximum-parsimony method. 
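The mean gamma value of 0.24 reported above can be read as the shape parameter of a gamma distribution of relative substitution rates across sites. The sketch below assumes the usual among-site rate model, in which rates have mean 1 and shape alpha; that reading, and the function names, are assumptions made for illustration rather than details taken from the study.

```python
import random


def relative_rate_variance(alpha):
    """Variance of mean-1 gamma-distributed site rates with shape alpha."""
    return 1.0 / alpha


def simulate_site_rates(alpha, n_sites, seed=0):
    """Draw relative rates from a gamma distribution with mean 1.

    random.gammavariate takes (shape, scale); scale = 1/alpha gives mean 1.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


alpha = 0.24  # mean gamma value reported in the text
rates = simulate_site_rates(alpha, 100_000)
mean_rate = sum(rates) / len(rates)
nearly_invariant = sum(r < 0.1 for r in rates) / len(rates)

print(f"variance of relative rates: {relative_rate_variance(alpha):.2f}")
print(f"simulated mean rate: {mean_rate:.2f}")
print(f"fraction of sites evolving at <10% of the mean rate: {nearly_invariant:.2f}")
```

With a shape parameter this small, a large share of sites change very slowly while a minority change very quickly, which is the strong rate heterogeneity across sites that the analysis reports.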

Phylogenetic analyses of complete structural gene sequences.
In an attempt to resolve further some of the discrepancies in
the tree topologies generated from partial E1 protein gene
sequences, we analyzed the complete nonstructural and structural
protein gene sequences available for all alphavirus species
by using the methods described above. Included in this
analysis was the partial structural polyprotein sequence of
SESV (32). Individual genes were also analyzed, and no evidence
of recombination (topology differences supported by
bootstrap values of 80% or greater), aside from the recombinant
WEEV-Highlands J virus-Fort Morgan virus group
described previously (19, 80), was detected. Structural polyprotein
gene trees were consistently more robust than those
constructed from nonstructural genes and also included more
alphavirus representatives. Therefore, we focused on the structural
polyprotein gene analyses.

Trees generated by using the maximum-parsimony and
neighbor-joining methods had identical topologies, except for the
placement of Middelburg virus, which fell within the Semliki
Forest complex when the neighbor-joining method was used and
was basal to the Semliki Forest virus complex when the maximum-parsimony
method was used. The neighbor-joining tree
generated by using amino acid sequences, which had higher bootstrap
values than all others, is shown in Fig. 3. Because this tree
had robust groupings for the VEE complex, we applied the VEE
topology to the partial E1 protein gene sequence analysis (Fig. 2)
and compared the maximum-likelihood values generated for both
the E1 and structural polyprotein topologies. The likelihood ratios
indicated that the neighbor-joining topology generated by
using structural polyprotein sequences was as likely as the original
topology generated with E1 nucleotide sequences. The original
E1 topology, which placed Cabassou virus at the base of the VEE
clade, with Pixuna virus a sister group to VEEV and EVEV, was
not significantly different in likelihood when our sequence evolution model
was used (P > 0.3). Therefore, we believe that Fig. 2 represents
the most accurate topology available for the genus Alphavirus.

The fish viruses were more clearly separated from the others in the complete
structural polyprotein analyses than in the trees generated
from partial E1 sequences, providing stronger evidence that they
would represent the basal clade in a rooted tree, as indicated in
the midpoint rooted trees (Fig. 2). SESV also appeared to be
quite distant genetically from all of the mosquito-borne alphaviruses,
with an amino acid sequence divergence level equivalent to
that of a distinct antigenic complex. However, the distance of the
SESV branch could be somewhat misleading if the missing regions
(part of the capsid and E1 protein sequences) are less
divergent than the included sequence regions (E3, E2, and 6K).



Evolutionary origin of the alphaviruses. Previous analyses of
alphavirus evolution suggested that the genus originated in the
New World from an insect-borne plant virus (35, 73, 79). Our
results are consistent with this hypothesis. Excluding
the fish and seal viruses, a New World origin would
require at least three transoceanic introductions between the
hemispheres: (i) transport of the ancestor of the Barmah Forest-Ndumu-Middelburg-Semliki
Forest virus complexes from
the New World to the Old World, (ii) transport of the ancestor
of the Sindbis and Whataroa viruses to the Old World, and (iii)
transport of the ancestor of the Mayaro and Una viruses from
the Old World to the New World (Fig. 2). However, an Old
World origin is also consistent with three transoceanic introductions
between the hemispheres: (i) transport of the ancestor
of the Trocara virus-WEE-EEE-VEE complexes from the
Old World to the New World, (ii) transport of the ancestor of
the Sindbis and Whataroa viruses to the Old World, and (iii)
transport of the ancestor of the Mayaro and Una viruses from
the Old World to the New World (Fig. 2). These equally parsimonious
scenarios do not favor either hypothesis over the
other. An ancestral alphavirus presumably adapted to fish in
the distant past to form the SDV-SPDV lineage. The possible
transmission of SESV by insects (lice) strengthens the hypothesis
that alphaviruses arose as insect-borne or insect viruses.

Previous estimates placed the origin of the alphaviruses several
thousand years ago (73, 79). However, the methods employed
previously relied on the assumption of an equal rate of
substitutions across nucleotide or amino acid positions in the
alphavirus genome. Our data clearly indicate that this assumption
is invalid; all estimates indicate that nucleotide
changes across sites are far from uniform, with an average
gamma value of only 0.24 for those viruses examined in detail
(range, 0.05 to 0.31). This nonuniformity in nucleotide substitutions
across sites, combined with the saturation of nucleotide
changes in many positions, indicates that estimates on the
order of thousands of years ago for the alphavirus ancestor are
far too recent. An accurate time estimate for the alphavirus
progenitor may be impossible due to these factors.
Another problem, the estimation of internal branch lengths,
is illustrated by our analysis of the recombination event that
produced the ancestor of the
WEEV-Fort Morgan virus-Highlands J virus group (19, 80).
The interior branch lengths produced with most of the phylogenetic
methods yielded different horizontal positions for the
internal branches shown previously to represent the recombinant
ancestors (80) (Fig. 2). The fact that these ancestors did
not occur at the same horizontal position (the dashed line in
Fig. 2 cannot be drawn vertically) indicates error in the internal
branch lengths of either the [...] complex clade or [...].

Because homologous sequences for the structural proteins
cannot be identified in viruses outside the genus Alphavirus,
including those of Rubivirus, the other genus
in the family Togaviridae, our trees could not be rooted
by using an outgroup. If midpoint rooting is used, the fish viruses
are consistently placed at the base of our alphavirus
trees; this rooting relies on the assumption of a constant rate
of evolution across different lineages in the tree. The WEE
complex recombination example described above implies that
this assumption is not completely correct and suggests that an
unrooted tree is the most accurate representation of the genus.
Mechanisms of Alphavirus radiation. Previous studies of Alphavirus
diversification have emphasized host switching events
(49, 73) and geographic introductions in the evolution of the genus.
Examination of the complete phylogeny
confirms the importance of these mechanisms. The
Alphavirus phylogenies also show numerous examples of host



switching events, such as the presumed introduction of EEEV
into North America, accompanied by switching from Culex to
Culiseta mosquito vectors (73). EVEV was presumably introduced
into Florida from Central or South America and adapted
to Culex cedecei, which occurs only in North America.
Chikungunya virus is believed to have originated in East Africa
in a nonhuman primate-sylvatic Aedes mosquito transmission
cycle and was later introduced into Asia along with the urban
vector, A. aegypti (48). O'nyong-nyong virus is believed to
have evolved from a chikungunya-like virus that adapted to Anopheles
mosquito vectors, a unique trait among alphaviruses (48).
The diversity exhibited by alphavirus groups may be influ- 
enced strongly by host mobility. Viruses that utilize reservoir 
hosts with limited mobility, such as small mammals, tend to be 
quite diverse and have nonoverlapping distributions. The best 
examples are the VEE complex viruses, which use primarily 
rodent hosts and Culex (Melanoconion) mosquito vectors with 
a limited flight range. VEE complex viruses occur nearly 
throughout the neotropics and subtropics, but the distributions 
of the various subtypes are discrete, for the most part. A similar
phenomenon is seen among the isolates of
Ross River virus from Australia (37). Viruses that use birds as
reservoir hosts, such as Sindbis virus, EEEV, and WEEV,
[...] tend to occupy a greater geographic range (37, 54). Host
mobility presumably limits virus diversity by preventing geographic
isolation and allopatric divergence and by favoring
[...] of closely related viruses that [...]
over large geographic ranges.

Alphavirus species. Initially, Alphavirus classification was
defined by the Subcommittee on Interrelationships Among
Catalogued Arboviruses (SIRACA) of the American Committee
on Arthropod-Borne Viruses, which relied completely on
antigenic cross-reactivity in tests such as hemagglutination
inhibition, complement fixation, and neutralization (4, 7). These
tests defined antigenic complexes whose members showed cross-reactivity to
each other but not to members of other complexes. Different
species showed fourfold or greater differences in cross-reactivity in both directions (one
virus reacted against antibody from a second, and the second
virus reacted against antibody produced against the first) compared
to homologous (a given virus reacted against antibody
produced against itself) reactions, subtypes showed
fourfold differences in one direction
only, and antigenic varieties were distinguishable only
with special tests.

The International Committee on Taxonomy of Viruses (ICTV)
has limited its classification to species within the genus (no
complexes or subtypes are defined). Currently, the ICTV defines
a virus species as a polythetic class of viruses that constitutes
a replicating lineage and occupies a particular ecological
niche (67, 68). This definition includes additional criteria in
comparison to the SIRACA classification, but this requires
more subjective interpretation in some cases. For example,
EVEV is considered a species distinct from VEEV
(77) (Table 1), although the SIRACA classification includes it
[illegible] within the VEEV subtype IAB/C/D clade (49,
[illegible]). A natural classification would not include this
variant [illegible] and would consider EVEV a
[illegible] 78V3531 (Fig. 2). However, EVEV clearly constitutes a
replicating lineage (it occurs only in Florida and is genetically
[illegible] on this distribution) and occupies a particular
ecological niche (it has a mosquito vector different
from those of all other VEE complex viruses). Also, EVEV
[illegible] with the emergence of epidemic and
[illegible] ID and IE viruses (74). Synonymy
[illegible] has been previously proposed (27, 39);
although [illegible] in many theoretical respects, it would have
important practical implications due to biological safety
recommendations (66). An additional example of the [illegible]
of [illegible] and taxonomy is the classification
of Barmah Forest virus in the family Bunyaviridae based on
[illegible] (38). However, subsequent genetic characterization
revealed it to be a member of the Alphavirus genus [illegible].
Despite the fundamental differences between the antigenic
and ICTV species definitions, the systematics of the
alphaviruses developed on antigenic grounds alone (4) agree
remarkably well with those of the ICTV (77). The more detailed
SIRACA classification of antigenic subtypes can
lead to [illegible] minor genetic changes that have a dramatic
effect on antigenicity and thus the rapid appearance of new
[illegible]. An example is an antigenic subtype of EEEV isolated
from a human in Mississippi in 1983 (5). Although this strain met
antigenic criteria as a subtype, genetic analyses demonstrated that
minor genetic changes resulted in the addition of an N-linked
glycosylation site in the E2 protein (78). Although there was no
evidence that this genotype persisted beyond 1983, these kinds
of antigenic changes could be epidemiologically important,
[illegible] where only one or [illegible]
substitutions in the E2 envelope glycoprotein can result in the
generation of subtype IC equine-virulent strains from enzootic,
[illegible] subtype ID progenitors (72).
[illegible] effects on pathogenesis and host range,
leading to epizootics. A completely natural classification would
not distinguish these subtypes because they are paraphyletic
and the epizootic viruses do not appear to constitute
[illegible]. [illegible] balance
theoretical and practical considerations.

Alphavirus complexes. The seven antigenic complexes of
alphaviruses [illegible]
that share medically important characteristics. For example,
members of the EEE and VEE complexes share encephalitic
potential in equines and humans, while the Semliki Forest
complex viruses generally produce an arthralgic syndrome.
The grouping of Barmah Forest virus with the Semliki
Forest virus complex viruses is consistent with their sharing this
pathogenic trait. The WEE complex includes viruses that produce
both arthralgic (Sindbis-like clade) and encephalitic
(WEEV and Highlands J virus) syndromes. WEEV and Highlands
J virus are descendants of a recombinant alphavirus, and
their encephalitic potential presumably reflects the genetic
contribution (nonstructural proteins, capsid, and 3' NCR) of
the EEEV-like ancestor rather than the Sindbis virus-like
ancestor ([illegible]). The only inconsistency of the established
Alphavirus complexes with evolutionary relationships is
Middelburg virus, which is classified as a separate antigenic
complex based on antigenic relationships (4). While there are
very few isolates available and the epidemiologic [illegible],
[illegible]
the Semliki Forest virus complex clade (Fig. 2 and 3).

Interestingly, serological characterizations may provide some
[illegible] monoclonal antibodies generated against the
Semliki Forest virus nucleocapsid [illegible]
assays they cross-react with members of the Semliki Forest
virus, [illegible] EEEV, [illegible] virus, and Ndumu virus
complexes but do not cross-react at all with VEEV or Barmah
Forest virus ([illegible]). As the nucleocapsid is among the more
conserved virion proteins, this may reflect some ancient
relationships among the alphaviruses.
Recommended revisions of the genus Alphavirus. Although
our phylogenetic data generally supported the current
Alphavirus classification, several discrepancies were noted. (i) Virus
strain 78V3531 (VEE subtype IF according to SIRACA) is
quite distinct phylogenetically from VEEV, and its closest
relative is Ag80-646 (AG80V). Although its transmission cycle has
not been characterized and its niche cannot therefore be
evaluated, this virus probably warrants species designation based
on the clear distinction of its genetic lineage, its isolation in
a part of Brazil not known to be inhabited by other VEEV
complex alphaviruses, and its antigenic distinction (3). Unlike
VEEV, it is also avirulent for adult mice and is not associated
with VEEV outbreaks. (ii) Tonate virus, a member of VEEV
complex subtype III, is quite distinct from the other members
of subtype III, with at least 16% nucleotide and 1% amino acid
sequence divergence (Table 4). In addition to their antigenic
differences, the Tonate and Mucambo viruses apparently use
different reservoir hosts (birds and small mammals,
respectively) (71). They should probably be considered distinct
species. The Bijou Bridge strain from western North America, also
a bird virus and apparently transmitted by nest bugs (40), is
appropriately considered a strain of Tonate virus due to its
genetic similarity and similar niche. (iii) Although its
transmission cycle remains obscure, Trocara virus also appears to be a
new Alphavirus species based on genetic distinctions from all
other species (65). The antigenic comparisons suggesting that
Trocara virus represents a new antigenic complex are not as
comprehensive as our sequence comparisons, and
cross-reactions with members of several Alphavirus serocomplexes were
very weak. (iv) Me Tri virus, originally reported to be a new
Alphavirus based on antigenic criteria, is genetically very close
to Semliki Forest virus and does not appear to constitute a
separate lineage (although lineage is a rather arbitrary term);
its genetic distance from Semliki Forest virus is similar to the
distances among other Alphavirus subtypes or strains, and it
should probably be considered a subtype or strain of Semliki
Forest virus. (v) Sagiyama virus, considered by SIRACA to be
a subtype of Getah virus, along with Ross River virus and
Bebaru virus (4), and considered a subtype of Ross River virus
in the most recent ICTV classification (77), is much more
closely related to Getah virus than to Ross River or Bebaru
virus. Based on our genetic data alone, the Ross River, Bebaru,
and Getah viruses should be retained as distinct Alphavirus
species but, as suggested by Shirako and Yamaguchi (57),
Sagiyama virus should be considered a subtype of Getah virus.
(vi) Kyzylagach virus, which was originally isolated in
Azerbaijan and was recently identified in China (36), appears to be one
of the most distinct subtypes of Sindbis virus yet identified. The
genetic data indicate that it could be classified as either a
subtype of Sindbis virus or a distinct species (18% divergence
at the nucleotide level and 6 to 8% divergence at the amino
acid level). Additionally, of all of the viruses analyzed in this
study, this is the only virus that could not be sequenced with
the degenerate alphavirus sequencing primers; it required
species-specific primers. Because SIN viruses are usually
transmitted among avian hosts and maintain a high degree of genetic
homogeneity, the fact that Kyzylagach virus exists in a lineage
so independent from all other SIN viruses suggests that it could
be classified as a distinct species. (vii) SDV and SPDV,
although not yet compared antigenically to the alphaviruses (69,
81), also appear to represent a distinct complex based on their
sequence divergence. They clearly occupy dramatically
different niches and genetic lineages from all remaining
alphaviruses, indicating that they are not variants of an established
species. However, the very small amount of sequence
divergence between the two fish viruses suggests that SDV is really
a strain or subtype of the novel Alphavirus species SPDV. (viii)
SESV also represents a new Alphavirus species, as reported
previously (32). It appears to be quite distinct genetically from
all of the mosquito-borne alphaviruses, with the amino acid
sequence divergence level of a distinct antigenic complex.
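
The nucleotide and amino acid divergence percentages quoted in the revisions above can be illustrated with a minimal Python sketch of an uncorrected pairwise distance between two aligned sequences. This is an illustration only and is not claimed to be the method used in this study; the function name percent_divergence and the toy sequences are hypothetical, and skipping gapped columns is an assumption.

```python
def percent_divergence(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Uncorrected pairwise divergence (p-distance) as a percentage.

    Both sequences must already be aligned to the same length.
    Columns where either sequence has a gap are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0    # columns with a residue in both sequences
    mismatches = 0  # compared columns where the residues differ
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    # Toy aligned fragments, made up for illustration (not real viral sequence).
    toy_a = "ATGGCGTTACCGATAGCTAA"
    toy_b = "ATGGCATTACC-ATCGCTAA"
    print(f"{percent_divergence(toy_a, toy_b):.1f}% divergence")
```

The same function works for aligned amino acid sequences, since it only compares characters column by column. Model-corrected distances would generally be larger than this raw percentage for divergent pairs.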